An AI Planning Approach for Generating Big Data Workflows
Abstract
The scale of big data causes the compositions of extract-transform-load (ETL) workflows to grow increasingly complex. As turnaround time for delivering solutions receives greater emphasis, stakeholders can no longer afford to wait the hundreds of hours it takes domain experts to compose a workflow solution manually. This paper describes a novel AI planning approach that facilitates rapid composition and maintenance of ETL workflows. The workflow engine is evaluated on real-world scenarios from an industrial partner, and results gathered from a prototype are reported to demonstrate the validity of the approach.
Similar resources
Big Data Exploration Via Automated Orchestration of Analytic Workflows
Large-scale data exploration using Big Data platforms requires the orchestration of complex analytic workflows composed of atomic analytic components for data selection, feature extraction, modeling and scoring. In this paper, we propose an approach that uses a combination of planning and machine learning to automatically determine the most appropriate data-driven workflows to execute in respon...
An ontology-based framework for bioinformatics workflows
The proliferation of bioinformatics activities brings new challenges - how to understand and organise these resources, how to exchange and reuse successful experimental procedures, and to provide interoperability among data and tools. This paper describes an effort toward these directions. It is based on combining research on ontology management, AI and scientific workflows to design, reuse and...
Automatic Composition of Secure Workflows
Automatic goal-driven composition of information processing workflows, or workflow planning, has become an active area of research in recent years. Various workflow planning methods have been proposed for automatic application development in systems like Web services, stream processing and grid computing based on compositional architectures. Significant progress has been made on the development...
Improving the Execution of KDD Workflows Generated by AI Planners
PDM is a distributed architecture for automating data mining (DM) and knowledge discovery processes (KDD) based on Artificial Intelligence (AI) Planning. A user easily defines a DM task through a graphical interface specifying the dataset, the DM goals and constraints, and the operations that could be used within the DM process. Then, the tool automatically obtains all the possible models that ...
Automated Data Management Workflow Generation with Ontologies and Planning
When working with data management systems, it is often required to specify and model the data within the system. Ontologies are a widely used method of defining data and its relations within a system by defining concepts, properties, and roles. In addition to the definition of the data structures, workflows are required for adding new knowledge to the ontologies. We will show how AI planning ca...